Bayesian profile regression with an application to the National Survey of Children's Health.
نویسندگان
چکیده
Standard regression analyses are often plagued with problems encountered when one tries to make inference going beyond main effects using data sets that contain dozens of variables that are potentially correlated. This situation arises, for example, in epidemiology where surveys or study questionnaires consisting of a large number of questions yield a potentially unwieldy set of interrelated data from which teasing out the effect of multiple covariates is difficult. We propose a method that addresses these problems for categorical covariates by using, as its basic unit of inference, a profile formed from a sequence of covariate values. These covariate profiles are clustered into groups and associated via a regression model to a relevant outcome. The Bayesian clustering aspect of the proposed modeling framework has a number of advantages over traditional clustering approaches in that it allows the number of groups to vary, uncovers subgroups and examines their association with an outcome of interest, and fits the model as a unit, allowing an individual's outcome potentially to influence cluster membership. The method is demonstrated with an analysis of survey data obtained from the National Survey of Children's Health. The approach has been implemented using the standard Bayesian modeling software, WinBUGS, with code provided in the supplementary material available at Biostatistics online. Further, interpretation of partitions of the data is helped by a number of postprocessing tools that we have developed.
منابع مشابه
A Bayesian Nominal Regression Model with Random Effects for Analysing Tehran Labor Force Survey Data
Large survey data are often accompanied by sampling weights that reflect the inequality probabilities for selecting samples in complex sampling. Sampling weights act as an expansion factor that, by scaling the subjects, turns the sample into a representative of the community. The quasi-maximum likelihood method is one of the approaches for considering sampling weights in the frequentist framewo...
متن کاملBayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on (0, 1) interval that covers symm...
متن کاملThe Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
In previous studies on fitting non-linear regression models with the symmetric structure the normality is usually assumed in the analysis of data. This choice may be inappropriate when the distribution of residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions is the main concern of many researchers. This family includes several skewed and heavy-tailed d...
متن کاملPredicting waste generation using Bayesian model averaging
A prognosis model has been developed for solid waste generation from households in Hoi An City, a famous tourist city in Viet Nam. Waste sampling, followed by a questionnaire survey, was carried out to gather data. The Bayesian model average method was used to identify factors significantly associated with waste generation. Multivariate linear regression analysis was then applied to evaluate th...
متن کاملComparison of Maximum Likelihood Estimation and Bayesian with Generalized Gibbs Sampling for Ordinal Regression Analysis of Ovarian Hyperstimulation Syndrome
Background and Objectives: Analysis of ordinal data outcomes could lead to bias estimates and large variance in sparse one. The objective of this study is to compare parameter estimates of an ordinal regression model under maximum likelihood and Bayesian framework with generalized Gibbs sampling. The models were used to analyze ovarian hyperstimulation syndrome data. Methods: This study use...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Biostatistics
دوره 11 3 شماره
صفحات -
تاریخ انتشار 2010